{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b32d3842",
   "metadata": {},
   "source": [
    "# SDXL Int8 Quantization Solution by ModelOpt\n",
    "\n",
    "### Note:\n",
    "This notebook requires nvidia-modelopt > 0.9.x, which comes with NeMo framework container > 23.05. An example command to launch the container:\n",
    "\n",
    "```\n",
    "docker run --gpus all -it --rm -v <your_nemo_dir>:/opt/NeMo --shm-size=8g \\\n",
    "     -p 8888:8888 --ulimit memlock=-1 --ulimit \\\n",
    "      stack=67108864 <your_nemo_container>\n",
    "```\n",
    "\n",
    "This tutorial shows how to use ModelOpt to calibrate and quantize the UNet part of the SDXL within NeMo framework. \n",
    "\n",
    "Please note that NeMo provides users with an end-to-end training framework for SDXL, and this quantization pipeline is supposed to work with a `.nemo` checkpoint trained from their own text-image dataset. In this tutorial, a open-source checkpoint is converted to `.nemo` format for illustration purpose."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f8320ca",
   "metadata": {},
   "source": [
    "### Download SDXL checkpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bd436eab",
   "metadata": {},
   "outputs": [],
   "source": [
    "## Download Unet checkpoint\n",
    "! mkdir -p /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet && wget -P /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/unet/diffusion_pytorch_model.safetensors\n",
    "## Download Vae checkpoint  \n",
    "! mkdir -p /sdxl_ckpts/stable-diffusion-xl-base-1.0/vae && wget -P /sdxl_ckpts/stable-diffusion-xl-base-1.0/vae https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/vae/diffusion_pytorch_model.safetensors"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70164e82",
   "metadata": {},
   "source": [
    "### Convert downloaded checkpoint into `.nemo` format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c9649553",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "FlashAttention Installed\n",
      "[NeMo I 2024-04-24 22:13:11 distributed:42] Initializing torch.distributed with local_rank: 0, rank: 0, world_size: 1\n",
      "GPU available: True (cuda), used: True\n",
      "TPU available: False, using: 0 TPU cores\n",
      "IPU available: False, using: 0 IPUs\n",
      "HPU available: False, using: 0 HPUs\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1183] hidden_size not found in {'precision': 'bf16-mixed', 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': True, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/diffusion_pytorch_model.safetensors'}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/vae/diffusion_pytorch_model.safetensors', 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0'}. Set this in model_parallel_config if using pipeline parallelism.\n",
      "[rank0]:[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:253] Rank 0 has data parallel group : [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:259] Rank 0 has combined group of data parallel and context parallel : [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:264] All data parallel group ranks with context parallel combined: [[0]]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:267] Ranks 0 has data parallel rank: 0\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:284] Rank 0 has context parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:287] All context parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:288] Ranks 0 has context parallel rank: 0\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:299] Rank 0 has model parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:300] All model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:310] Rank 0 has tensor model parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:314] All tensor model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:315] Rank 0 has tensor model parallel rank: 0\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:344] Rank 0 has pipeline model parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:356] Rank 0 has embedding group: [0]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:362] All pipeline model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:363] Rank 0 has pipeline model parallel rank 0\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:364] All embedding group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:13:12 megatron_init:365] Rank 0 has embedding rank: 0\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1183] hidden_size not found in {'precision': 'bf16-mixed', 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': True, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/diffusion_pytorch_model.safetensors'}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/vae/diffusion_pytorch_model.safetensors', 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0'}. Set this in model_parallel_config if using pipeline parallelism.\n",
      "[NeMo I 2024-04-24 22:13:12 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:13:12 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:12 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:13:12 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:12 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:13:12 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:14 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:13:14 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:16 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:13:16 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:18 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:13:18 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:20 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:13:20 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:21 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:13:21 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:23 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:13:23 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:24 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:13:24 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:24 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:13:24 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:13:25 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.wrappers>, cls=<OpenAIWrapper>\n",
      "open_clip_pytorch_model.bin: 100%|██████████| 10.2G/10.2G [01:36<00:00, 106MB/s]\n",
      "Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False\n",
      "Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False\n",
      "Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "making attention of type 'vanilla' with 512 in_channels\n",
      "Working with z of shape (1, 4, 32, 32) = 4096 dimensions.\n",
      "making attention of type 'vanilla' with 512 in_channels\n",
      "AutoencoderKLInferenceWrapper: Following keys are missing during loading VAE weights, which may lead to compromised image quality for a resumed training. Please check the checkpoint you provided.\n",
      "Missing:['encoder.mid.attn_1.proj_out.bias', 'decoder.mid.attn_1.v.weight', 'encoder.mid.attn_1.proj_out.weight', 'decoder.mid.attn_1.proj_out.bias', 'decoder.mid.attn_1.q.weight', 'decoder.mid.attn_1.q.bias', 'encoder.mid.attn_1.q.weight', 'encoder.mid.attn_1.k.weight', 'encoder.mid.attn_1.v.bias', 'decoder.mid.attn_1.k.weight', 'decoder.mid.attn_1.v.bias', 'decoder.mid.attn_1.proj_out.weight', 'encoder.mid.attn_1.q.bias', 'encoder.mid.attn_1.v.weight', 'encoder.mid.attn_1.k.bias', 'decoder.mid.attn_1.k.bias']\n",
      "Unexpected:['encoder.mid.attentions.0.to_k.weight', 'decoder.mid.attentions.0.to_out.0.weight', 'encoder.mid.attentions.0.to_v.bias', 'decoder.mid.attentions.0.to_q.bias', 'encoder.mid.attentions.0.to_q.weight', 'encoder.mid.attentions.0.to_v.weight', 'decoder.mid.attentions.0.to_k.weight', 'decoder.mid.attentions.0.to_v.bias', 'encoder.mid.attentions.0.to_k.bias', 'encoder.mid.attentions.0.to_out.0.bias', 'decoder.mid.attentions.0.to_out.0.bias', 'encoder.mid.attentions.0.to_out.0.weight', 'decoder.mid.attentions.0.to_k.bias', 'decoder.mid.attentions.0.to_v.weight', 'decoder.mid.attentions.0.to_q.weight', 'encoder.mid.attentions.0.to_q.bias']\n",
      "[NeMo I 2024-04-24 22:15:42 convert_hf_ckpt_to_nemo:226] NeMo model saved to: /quantization/sdxl_base.nemo\n"
     ]
    }
   ],
   "source": [
    "WORKDIR = '/quantization'\n",
    "! torchrun /opt/NeMo/examples/multimodal/text_to_image/convert_hf_ckpt_to_nemo.py \\\n",
    "    --model_type sdxl \\\n",
    "    --ckpt_path /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/diffusion_pytorch_model.safetensors \\\n",
    "    --hparams_file /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/conf/sd_xl_base_train.yaml \\\n",
    "    --nemo_file_path $WORKDIR/sdxl_base.nemo"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25817b85",
   "metadata": {},
   "source": [
    "### Run quantization script with default config, and finally the script will export the quantized unet to onnx file.\n",
    "\n",
    "##### Quantization config\n",
    "\n",
    "```yaml\n",
    "quantize\n",
    "  exp_name: nemo_test\n",
    "  n_steps: 20          # number of inference steps\n",
    "  format: 'int8'       # only int8 quantization is supported now\n",
    "  percentile: 1.0      # Control quantization scaling factors (amax) collecting range, meaning that we will collect the minimum amax in the range of `(n_steps * percentile)` steps. Recommendation: 1.0\n",
    "  batch_size: 1        # batch size calling sdxl inference pipeline during calibration\n",
    "  calib_size: 32       # For SDXL, we recommend 32, 64 or 128\n",
    "  quant_level: 2.5     #Which layers to be quantized, 1: `CNNs`, 2: `CNN + FFN`, 2.5: `CNN + FFN + QKV`, 3: `CNN + Linear`. Recommendation: 2, 2.5 and 3, depending on the requirements for image quality & speedup.\n",
    "  alpha: 0.8           # A parameter in SmoothQuant, used for linear layers only. Recommendation: 0.8 for SDXL\n",
    "```\n",
    "\n",
    "##### Onnx export config\n",
    "\n",
    "```yaml\n",
    "onnx_export:\n",
    "  onnx_dir: nemo_onnx    # Path to save onnx files\n",
    "  pretrained_base: ${model.restore_from_path}  # Path to nemo checkpoint for sdxl\n",
    "  quantized_ckpt: nemo.unet.state_dict.${quantize.exp_name}.pt  # Path to save quantized unet checkpoint\n",
    "  format: int8\n",
    "```\n",
    "##### Onnx export config\n",
    "\n",
    "```yaml\n",
    "trt_export:\n",
    "  static_batch: False # static batch engines have better latency\n",
    "  min_batch_size: 1   # minimum batch size when using dynamic batch, has to be the same with max_batch_size and infer.num_samples when using static batch\n",
    "  max_batch_size: 1   # maximum batch size when using dynamic batch, has to be the same with min_batch_size and infer.num_samples when using static batch\n",
    "  int8: True          # Allow engine builder recognize int8 precision\n",
    "  builder_optimization_level: 4  # set to 1-5, higher optimization level means better latency but longer compiling time\n",
    "  trt_engine: int8_unet_xl.plan  # path to save trt engine\n",
    "```\n",
    "\n",
    "The following command restores a pre-trained sdxl model from `$WORKDIR/sdxl_base.nemo` derived from the above step.\n",
    "The quantized U-Net checkpoint is saved to `quantize.quantized_ckpt`, converted onnx file is saved to `onnx_export.onnx_dir` and trt engine is saved to `trt_export.trt_engine`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "d955f6c3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "FlashAttention Installed\n",
      "[NeMo W 2024-04-24 19:42:59 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.\n",
      "    See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.\n",
      "      ret = run_job(\n",
      "    \n",
      "[NeMo W 2024-04-24 19:42:59 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/lightning_fabric/connector.py:563: `precision=16` is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!\n",
      "    \n",
      "Using 16bit Automatic Mixed Precision (AMP)\n",
      "GPU available: True (cuda), used: True\n",
      "TPU available: False, using: 0 TPU cores\n",
      "IPU available: False, using: 0 IPUs\n",
      "HPU available: False, using: 0 HPUs\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n",
      "[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:253] Rank 0 has data parallel group : [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:259] Rank 0 has combined group of data parallel and context parallel : [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:264] All data parallel group ranks with context parallel combined: [[0]]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:267] Ranks 0 has data parallel rank: 0\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:284] Rank 0 has context parallel group: [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:287] All context parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:288] Ranks 0 has context parallel rank: 0\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:299] Rank 0 has model parallel group: [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:300] All model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:310] Rank 0 has tensor model parallel group: [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:314] All tensor model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:315] Rank 0 has tensor model parallel rank: 0\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:344] Rank 0 has pipeline model parallel group: [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:356] Rank 0 has embedding group: [0]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:362] All pipeline model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:363] Rank 0 has pipeline model parallel rank 0\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:364] All embedding group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 19:43:09 megatron_init:365] Rank 0 has embedding rank: 0\n",
      "24-04-24 19:43:09 - PID:1361 - rank:(0, 0, 0, 0) - microbatches.py:39 - INFO - setting number of micro-batches to constant 1\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n",
      "[NeMo I 2024-04-24 19:43:09 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 19:43:09 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:09 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 19:43:09 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:10 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 19:43:10 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:11 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 19:43:11 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:13 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 19:43:13 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:15 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 19:43:15 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:17 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 19:43:17 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:19 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 19:43:19 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:20 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 19:43:20 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:21 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 19:43:21 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:21 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 19:43:21 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 19:43:21 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.wrappers>, cls=<OpenAIWrapper>\n",
      "Loaded ViT-bigG-14 model config.\n",
      "Loading pretrained ViT-bigG-14 weights (laion2b_s39b_b160k).\n",
      "Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False\n",
      "Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False\n",
      "Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "making attention of type 'vanilla' with 512 in_channels\n",
      "Working with z of shape (1, 4, 32, 32) = 4096 dimensions.\n",
      "making attention of type 'vanilla' with 512 in_channels\n",
      "[NeMo I 2024-04-24 19:43:53 nlp_overrides:1155] Model MegatronDiffusionEngine was successfully restored from /quantization/sdxl_base.nemo.\n",
      "Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1\n",
      "----------------------------------------------------------------------------------------------------\n",
      "distributed_backend=nccl\n",
      "All distributed processes registered. Starting with 1 processes\n",
      "----------------------------------------------------------------------------------------------------\n",
      "\n",
      "Building TensorRT engine for /quantization/nemo_onnx/unet.onnx: /quantization/int8_unet_xl.plan\n",
      "[I] Configuring with profiles:[\n",
      "        Profile 0:\n",
      "            {x [min=(1, 4, 128, 128), opt=(4, 4, 128, 128), max=(8, 4, 128, 128)],\n",
      "             y [min=(1, 2816), opt=(4, 2816), max=(8, 2816)],\n",
      "             timesteps [min=(1,), opt=(4,), max=(8,)],\n",
      "             context [min=(1, 80, 2048), opt=(4, 80, 2048), max=(8, 80, 2048)]}\n",
      "    ]\n",
      "\u001b[38;5;11m[W] It looks like some layers in the network have compute precision set, but precision constraints were not enabled. \n",
      "    Precision constraints must be set to 'prefer' or 'obey' for layer compute precision to take effect. \n",
      "    Note: Layers and their requested precisions were: {'/input_blocks.0/input_blocks.0.0/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.0/input_blocks.0.0/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.0/input_blocks.0.0/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.0/input_blocks.0.0/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.3/input_blocks.3.0/op/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.3/input_blocks.3.0/op/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.3/input_blocks.3.0/op/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.3/input_blocks.3.0/op/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.6/input_blocks.6.0/op/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.6/input_blocks.6.0/op/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.6/input_blocks.6.0/op/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.6/input_blocks.6.0/op/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.2/conv/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.2/conv/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.2/conv/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.2/conv/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.2/conv/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.2/conv/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.2/conv/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.2/conv/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/out/out.1/input_quantizer/QuantizeLinear': 'FLOAT', '/out/out.1/input_quantizer/DequantizeLinear': 'INT8', '/out/out.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/out/out.1/weight_quantizer/DequantizeLinear': 'INT8'}\u001b[0m\n",
      "\u001b[38;5;14m[I] Building engine with configuration:\n",
      "    Flags                  | [FP16, INT8]\n",
      "    Engine Capability      | EngineCapability.DEFAULT\n",
      "    Memory Pools           | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n",
      "    Tactic Sources         | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n",
      "    Profiling Verbosity    | ProfilingVerbosity.DETAILED\n",
      "    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n",
      "\u001b[38;5;10m[I] Finished engine building in 881.973 seconds\u001b[0m\n",
      "[I] Saving engine to /quantization/int8_unet_xl.plan\n"
     ]
    }
   ],
   "source": [
    "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_quantize.py model.restore_from_path=$WORKDIR/sdxl_base.nemo onnx_export.onnx_dir=$WORKDIR/nemo_onnx quantize.quantized_ckpt=$WORKDIR/nemo.unet.state_dict.nemo.pt trt_export.trt_engine=$WORKDIR/int8_unet_xl.plan\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f97d6bfa",
   "metadata": {},
   "source": [
    "### Build end to end TRT inference pipeline\n",
    "In order to run an end to end inference with quantized U-Net engine, we need to export and build engines for the other compenents in SDXL, which includes the VAE and two CLIP encoder. The following script restores SDXL from the `nemo` checkpoint and saves the corresponding engine files to `infer.out_path`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "2e8b7742",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "FlashAttention Installed\n",
      "[NeMo W 2024-04-24 22:17:42 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.\n",
      "    See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.\n",
      "      ret = run_job(\n",
      "    \n",
      "[NeMo W 2024-04-24 22:17:42 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/lightning_fabric/connector.py:563: `precision=16` is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!\n",
      "    \n",
      "Using 16bit Automatic Mixed Precision (AMP)\n",
      "GPU available: True (cuda), used: True\n",
      "TPU available: False, using: 0 TPU cores\n",
      "IPU available: False, using: 0 IPUs\n",
      "HPU available: False, using: 0 HPUs\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n",
      "[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:253] Rank 0 has data parallel group : [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:259] Rank 0 has combined group of data parallel and context parallel : [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:264] All data parallel group ranks with context parallel combined: [[0]]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:267] Ranks 0 has data parallel rank: 0\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:284] Rank 0 has context parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:287] All context parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:288] Ranks 0 has context parallel rank: 0\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:299] Rank 0 has model parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:300] All model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:310] Rank 0 has tensor model parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:314] All tensor model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:315] Rank 0 has tensor model parallel rank: 0\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:344] Rank 0 has pipeline model parallel group: [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:356] Rank 0 has embedding group: [0]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:362] All pipeline model parallel group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:363] Rank 0 has pipeline model parallel rank 0\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:364] All embedding group ranks: [[0]]\n",
      "[NeMo I 2024-04-24 22:17:50 megatron_init:365] Rank 0 has embedding rank: 0\n",
      "24-04-24 22:17:50 - PID:703 - rank:(0, 0, 0, 0) - microbatches.py:39 - INFO - setting number of micro-batches to constant 1\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n",
      "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n",
      "[NeMo I 2024-04-24 22:17:50 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:17:50 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:17:50 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:17:50 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:17:51 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:17:51 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:17:53 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:17:53 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:17:54 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:17:54 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:17:56 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:17:56 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:17:58 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:17:58 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:18:00 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n",
      "[NeMo I 2024-04-24 22:18:00 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:18:02 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:18:02 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:18:02 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:18:02 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:18:02 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n",
      "[NeMo I 2024-04-24 22:18:02 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n",
      "[NeMo I 2024-04-24 22:18:02 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.wrappers>, cls=<OpenAIWrapper>\n",
      "Loaded ViT-bigG-14 model config.\n",
      "Loading pretrained ViT-bigG-14 weights (laion2b_s39b_b160k).\n",
      "Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False\n",
      "Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False\n",
      "Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False\n",
      "making attention of type 'vanilla' with 512 in_channels\n",
      "Working with z of shape (1, 4, 32, 32) = 4096 dimensions.\n",
      "making attention of type 'vanilla' with 512 in_channels\n",
      "[NeMo I 2024-04-24 22:18:35 nlp_overrides:1155] Model MegatronDiffusionEngine was successfully restored from /quantization/sdxl_base.nemo.\n",
      "Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1\n",
      "----------------------------------------------------------------------------------------------------\n",
      "distributed_backend=nccl\n",
      "All distributed processes registered. Starting with 1 processes\n",
      "----------------------------------------------------------------------------------------------------\n",
      "\n",
      "[NeMo W 2024-04-24 22:18:36 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py:1184: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      assert y.shape[0] == x.shape[0]\n",
      "    \n",
      "[NeMo W 2024-04-24 22:18:36 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py:209: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      assert x.shape[1] == self.channels\n",
      "    \n",
      "[NeMo W 2024-04-24 22:18:37 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py:145: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      assert x.shape[1] == self.channels\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:17 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/model.py:172: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      w_ = w_ * (int(c) ** (-0.5))\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py:2095: UserWarning: Provided key z_pooled for dynamic axes is not a valid input/output name\n",
      "      warnings.warn(\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/modeling_attn_mask_utils.py:86: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      if input_shape[-1] > 1 or self.sliding_window is not None:\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/modeling_attn_mask_utils.py:162: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      if past_key_values_length > 0:\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py:281: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py:289: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      if causal_attention_mask.size() != (bsz, 1, tgt_len, src_len):\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py:321: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n",
      "      if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):\n",
      "    \n",
      "[NeMo W 2024-04-24 22:22:27 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/torch/onnx/symbolic_opset9.py:5859: UserWarning: Exporting aten::index operator of advanced indexing in opset 14 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.\n",
      "      warnings.warn(\n",
      "    \n",
      "Building TensorRT engine for /quantization/onnx/unet_xl/unet_xl.onnx: /quantization/plan/unet_xl.plan\n",
      "[I] Configuring with profiles:[\n",
      "        Profile 0:\n",
      "            {x [min=(1, 4, 128, 128), opt=(2, 4, 128, 128), max=(8, 4, 128, 128)],\n",
      "             y [min=(1, 2816), opt=(2, 2816), max=(8, 2816)],\n",
      "             timesteps [min=(1,), opt=(2,), max=(8,)],\n",
      "             context [min=(1, 80, 2048), opt=(2, 80, 2048), max=(8, 80, 2048)]}\n",
      "    ]\n",
      "\u001b[38;5;14m[I] Building engine with configuration:\n",
      "    Flags                  | [FP16]\n",
      "    Engine Capability      | EngineCapability.DEFAULT\n",
      "    Memory Pools           | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n",
      "    Tactic Sources         | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n",
      "    Profiling Verbosity    | ProfilingVerbosity.DETAILED\n",
      "    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n",
      "\u001b[38;5;11m[W] Detected layernorm nodes in FP16.\u001b[0m\n",
      "\u001b[38;5;11m[W] Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.\u001b[0m\n",
      "\u001b[38;5;10m[I] Finished engine building in 553.937 seconds\u001b[0m\n",
      "[I] Saving engine to /quantization/plan/unet_xl.plan\n",
      "Building TensorRT engine for /quantization/onnx/vae/vae.onnx: /quantization/plan/vae.plan\n",
      "[I] Configuring with profiles:[\n",
      "        Profile 0:\n",
      "            {z [min=(1, 4, 128, 128), opt=(2, 4, 128, 128), max=(8, 4, 128, 128)]}\n",
      "    ]\n",
      "\u001b[38;5;14m[I] Building engine with configuration:\n",
      "    Flags                  | []\n",
      "    Engine Capability      | EngineCapability.DEFAULT\n",
      "    Memory Pools           | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n",
      "    Tactic Sources         | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n",
      "    Profiling Verbosity    | ProfilingVerbosity.DETAILED\n",
      "    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n",
      "\u001b[38;5;10m[I] Finished engine building in 266.743 seconds\u001b[0m\n",
      "[I] Saving engine to /quantization/plan/vae.plan\n",
      "Building TensorRT engine for /quantization/onnx/clip1/clip1.onnx: /quantization/plan/clip1.plan\n",
      "\u001b[38;5;11m[W] ModelImporter.cpp:409: Make sure input input_ids has Int64 binding.\u001b[0m\n",
      "[I] Configuring with profiles:[\n",
      "        Profile 0:\n",
      "            {input_ids [min=(1, 77), opt=(2, 77), max=(8, 77)]}\n",
      "    ]\n",
      "\u001b[38;5;14m[I] Building engine with configuration:\n",
      "    Flags                  | [FP16]\n",
      "    Engine Capability      | EngineCapability.DEFAULT\n",
      "    Memory Pools           | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n",
      "    Tactic Sources         | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n",
      "    Profiling Verbosity    | ProfilingVerbosity.DETAILED\n",
      "    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n",
      "\u001b[38;5;10m[I] Finished engine building in 16.988 seconds\u001b[0m\n",
      "[I] Saving engine to /quantization/plan/clip1.plan\n",
      "Building TensorRT engine for /quantization/onnx/clip2/clip2.onnx: /quantization/plan/clip2.plan\n",
      "[I] Configuring with profiles:[\n",
      "        Profile 0:\n",
      "            {input_ids [min=(1, 77), opt=(2, 77), max=(8, 77)]}\n",
      "    ]\n",
      "\u001b[38;5;14m[I] Building engine with configuration:\n",
      "    Flags                  | [FP16]\n",
      "    Engine Capability      | EngineCapability.DEFAULT\n",
      "    Memory Pools           | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n",
      "    Tactic Sources         | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n",
      "    Profiling Verbosity    | ProfilingVerbosity.DETAILED\n",
      "    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n",
      "\u001b[38;5;10m[I] Finished engine building in 72.535 seconds\u001b[0m\n",
      "[I] Saving engine to /quantization/plan/clip2.plan\n"
     ]
    }
   ],
   "source": [
    "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_export.py model.restore_from_path=$WORKDIR/sdxl_base.nemo infer.out_path=$WORKDIR"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7eb2d03",
   "metadata": {},
   "source": [
    "### Run TRT inference pipeline with original engines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25737be2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "FlashAttention Installed\n",
      "[NeMo W 2024-04-24 22:46:11 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.\n",
      "    See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.\n",
      "      ret = run_job(\n",
      "    \n",
      "Loading TensorRT engine: /quantization/plan/unet_xl.plan\n",
      "[I] Loading bytes from /quantization/plan/unet_xl.plan\n",
      "unet_xl trt engine loaded successfully\n",
      "Loading TensorRT engine: /quantization/plan/vae.plan\n",
      "[I] Loading bytes from /quantization/plan/vae.plan\n",
      "vae trt engine loaded successfully\n",
      "Loading TensorRT engine: /quantization/plan/clip1.plan\n",
      "[I] Loading bytes from /quantization/plan/clip1.plan\n",
      "clip1 trt engine loaded successfully\n",
      "Loading TensorRT engine: /quantization/plan/clip2.plan\n",
      "[I] Loading bytes from /quantization/plan/clip2.plan\n",
      "clip2 trt engine loaded successfully\n",
      "[NeMo I 2024-04-24 22:46:17 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer>, cls=<LegacyDDPMDiscretization>\n",
      "[NeMo I 2024-04-24 22:46:17 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.guiders>, cls=<VanillaCFG>\n",
      "[NeMo I 2024-04-24 22:46:17 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sampling_utils>, cls=<NoDynamicThresholding>\n",
      "##############################  Sampling setting  ##############################\n",
      "Sampler: EulerEDMSampler\n",
      "Discretization: LegacyDDPMDiscretization\n",
      "Guider: VanillaCFG\n",
      "Sampling with EulerEDMSampler for 41 steps:  98%|▉| 40/41 [00:24<00:00,  1.60it/\n",
      "This batch takes 27.204587490297854s\n",
      "[NeMo I 2024-04-24 22:46:45 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer>, cls=<LegacyDDPMDiscretization>\n",
      "[NeMo I 2024-04-24 22:46:45 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.guiders>, cls=<VanillaCFG>\n",
      "[NeMo I 2024-04-24 22:46:45 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sampling_utils>, cls=<NoDynamicThresholding>\n",
      "##############################  Sampling setting  ##############################\n",
      "Sampler: EulerEDMSampler\n",
      "Discretization: LegacyDDPMDiscretization\n",
      "Guider: VanillaCFG\n",
      "Sampling with EulerEDMSampler for 41 steps:  98%|▉| 40/41 [00:25<00:00,  1.57it/\n",
      "This batch takes 25.58329666685313s\n",
      "[NeMo I 2024-04-24 22:47:14 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer>, cls=<LegacyDDPMDiscretization>\n",
      "[NeMo I 2024-04-24 22:47:14 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.guiders>, cls=<VanillaCFG>\n",
      "[NeMo I 2024-04-24 22:47:14 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sampling_utils>, cls=<NoDynamicThresholding>\n",
      "##############################  Sampling setting  ##############################\n",
      "Sampler: EulerEDMSampler\n",
      "Discretization: LegacyDDPMDiscretization\n",
      "Guider: VanillaCFG\n",
      "Sampling with EulerEDMSampler for 41 steps:  98%|▉| 40/41 [00:25<00:00,  1.55it/\n",
      "This batch takes 25.87396944500506s\n",
      "[NeMo I 2024-04-24 22:47:44 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer>, cls=<LegacyDDPMDiscretization>\n",
      "[NeMo I 2024-04-24 22:47:44 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.guiders>, cls=<VanillaCFG>\n",
      "[NeMo I 2024-04-24 22:47:44 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sampling_utils>, cls=<NoDynamicThresholding>\n",
      "##############################  Sampling setting  ##############################\n",
      "Sampler: EulerEDMSampler\n",
      "Discretization: LegacyDDPMDiscretization\n",
      "Guider: VanillaCFG\n",
      "Sampling with EulerEDMSampler for 41 steps:  98%|▉| 40/41 [00:25<00:00,  1.54it/\n",
      "This batch takes 26.03419069480151s\n",
      "[NeMo I 2024-04-24 22:48:13 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer>, cls=<LegacyDDPMDiscretization>\n",
      "[NeMo I 2024-04-24 22:48:13 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.guiders>, cls=<VanillaCFG>\n",
      "[NeMo I 2024-04-24 22:48:13 utils:108] Getting module=<nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sampling_utils>, cls=<NoDynamicThresholding>\n",
      "##############################  Sampling setting  ##############################\n",
      "Sampler: EulerEDMSampler\n",
      "Discretization: LegacyDDPMDiscretization\n",
      "Guider: VanillaCFG\n",
      "Sampling with EulerEDMSampler for 41 steps:  71%|▋| 29/41 [00:18<00:07,  1.52it/"
     ]
    }
   ],
   "source": [
    "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_trt_inference.py \\\n",
    "    out_path=$WORKDIR/trt_output_fp16 \\\n",
    "    unet_xl=$WORKDIR/plan/unet_xl.plan \\\n",
    "    vae=$WORKDIR/plan/vae.plan \\\n",
    "    clip1=$WORKDIR/plan/clip1.plan \\\n",
    "    clip2=$WORKDIR/plan/clip2.plan\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d11bbe7d",
   "metadata": {},
   "source": [
    "### Run TRT inference pipeline with quantized U-Net engine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "3f2263b4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "^C\n",
      "Traceback (most recent call last):\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3109, in _dep_map\n",
      "[2024-04-24 19:42:46,104] torch.distributed.elastic.agent.server.api: [WARNING] Received Signals.SIGINT death signal, shutting down workers\n",
      "[2024-04-24 19:42:46,104] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 1300 closing signal SIGINT\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 2902, in __getattr__\n",
      "    raise AttributeError(attr)\n",
      "AttributeError: _DistInfoDistribution__dep_map\n",
      "\n",
      "During handling of the above exception, another exception occurred:\n",
      "\n",
      "Traceback (most recent call last):\n",
      "  File \"/opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_trt_inference.py\", line 25, in <module>\n",
      "    from nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser import DiscreteDenoiser\n",
      "  File \"/opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/denoiser.py\", line 17, in <module>\n",
      "    from nemo.collections.multimodal.parts.stable_diffusion.utils import append_dims, instantiate_from_config\n",
      "  File \"/opt/NeMo/nemo/collections/multimodal/parts/stable_diffusion/utils.py\", line 25, in <module>\n",
      "    from nemo.utils import logging\n",
      "  File \"/opt/NeMo/nemo/utils/__init__.py\", line 31, in <module>\n",
      "    from nemo.utils.lightning_logger_patch import add_memory_handlers_to_pl_logger\n",
      "  File \"/opt/NeMo/nemo/utils/lightning_logger_patch.py\", line 18, in <module>\n",
      "    import pytorch_lightning as pl\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py\", line 27, in <module>\n",
      "    from pytorch_lightning.callbacks import Callback  # noqa: E402\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/__init__.py\", line 14, in <module>\n",
      "    from pytorch_lightning.callbacks.batch_size_finder import BatchSizeFinder\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/batch_size_finder.py\", line 26, in <module>\n",
      "    from pytorch_lightning.callbacks.callback import Callback\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/callback.py\", line 22, in <module>\n",
      "    from pytorch_lightning.utilities.types import STEP_OUTPUT\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/types.py\", line 41, in <module>\n",
      "    from torchmetrics import Metric\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/torchmetrics/__init__.py\", line 22, in <module>\n",
      "    from torchmetrics import functional  # noqa: E402\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/torchmetrics/functional/__init__.py\", line 121, in <module>\n",
      "    from torchmetrics.functional.text._deprecated import _bleu_score as bleu_score\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/torchmetrics/functional/text/__init__.py\", line 49, in <module>\n",
      "    if _TRANSFORMERS_GREATER_EQUAL_4_4:\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py\", line 164, in __bool__\n",
      "    self._check_available()\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py\", line 158, in _check_available\n",
      "    self._check_requirement()\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py\", line 132, in _check_requirement\n",
      "    pkg_resources.require(self.requirement)\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 968, in require\n",
      "    needed = self.resolve(parse_requirements(requirements))\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 834, in resolve\n",
      "    new_requirements = dist.requires(req.extras)[::-1]\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 2822, in requires\n",
      "    dm = self._dep_map\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3111, in _dep_map\n",
      "    self.__dep_map = self._compute_dependencies()\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3133, in _compute_dependencies\n",
      "    dm[s_extra] = [r for r in reqs_for_extra(extra) if r not in common]\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3133, in <listcomp>\n",
      "    dm[s_extra] = [r for r in reqs_for_extra(extra) if r not in common]\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3125, in reqs_for_extra\n",
      "    if not req.marker or req.marker.evaluate({'extra': extra}):\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/_vendor/packaging/markers.py\", line 252, in evaluate\n",
      "    return _evaluate_markers(self._markers, current_environment)\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/_vendor/packaging/markers.py\", line 164, in _evaluate_markers\n",
      "    return any(all(item) for item in groups)\n",
      "  File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/_vendor/packaging/markers.py\", line 164, in <genexpr>\n",
      "    return any(all(item) for item in groups)\n",
      "KeyboardInterrupt\n"
     ]
    }
   ],
   "source": [
    "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_trt_inference.py \\\n",
    "    out_path=$WORKDIR/trt_output_int8 \\\n",
    "    unet_xl=$WORKDIR/int8_unet_xl.plan \\\n",
    "    vae=$WORKDIR/plan/vae.plan \\\n",
    "    clip1=$WORKDIR/plan/clip1.plan \\\n",
    "    clip2=$WORKDIR/plan/clip2.plan"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c48c21dd",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
